Bounded-Distortion Metric Learning
Metric learning aims to embed one metric space into another to benefit tasks
like classification and clustering. Although a greatly distorted metric space
has a high degree of freedom to fit training data, it is prone to overfitting
and numerical inaccuracy. This paper presents {\it bounded-distortion metric
learning} (BDML), a new metric learning framework which amounts to finding an
optimal Mahalanobis metric space with a bounded-distortion constraint. An
efficient solver based on the multiplicative weights update method is proposed.
Moreover, we generalize BDML to pseudo-metric learning and devise the
semidefinite relaxation and a randomized algorithm to approximately solve it.
We further provide a theoretical analysis showing that distortion is a key
ingredient in the stability and generalization ability of our BDML algorithm.
Extensive experiments on several benchmark datasets yield promising results.
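To make the bounded-distortion idea concrete, here is a minimal numpy sketch (not the paper's solver): a Mahalanobis metric $d_M(x,y) = \sqrt{(x-y)^\top M (x-y)}$ with a PSD matrix $M$, and the empirical distortion of the induced embedding relative to the Euclidean metric, measured as the ratio of the largest to the smallest pairwise stretch. The data and the choice of $M$ below are illustrative stand-ins.

```python
import numpy as np

def mahalanobis(M, x, y):
    """Mahalanobis distance d_M(x, y) = sqrt((x - y)^T M (x - y)) for PSD M."""
    d = x - y
    return float(np.sqrt(d @ M @ d))

def empirical_distortion(M, X):
    """Ratio of the largest to smallest stretch d_M(x, y) / ||x - y|| over all pairs."""
    stretches = []
    n = len(X)
    for i in range(n):
        for j in range(i + 1, n):
            base = np.linalg.norm(X[i] - X[j])
            if base > 1e-12:
                stretches.append(mahalanobis(M, X[i], X[j]) / base)
    return max(stretches) / min(stretches)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
# PSD M with eigenvalues in [1, 4]: every pairwise stretch lies in [1, 2],
# so the distortion of this embedding is bounded above by 2.
M = np.diag([1.0, 1.0, 1.0, 1.0, 4.0])
print(empirical_distortion(M, X))
```

A bounded-distortion constraint in this spirit caps the eigenvalue spread of $M$, limiting how much the learned metric can stretch or shrink the data relative to the input space.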
Memorization Capacity of Multi-Head Attention in Transformers
In this paper, we investigate the memorization capabilities of multi-head
attention in Transformers, motivated by the central role attention plays in
these models. Under a mild linear independence assumption on the input data, we
present a theoretical analysis demonstrating that an $H$-head attention layer
with context size $n$, dimension $d$, and $\Theta(Hd^2)$ parameters can memorize
$\Omega(Hn)$ examples. We conduct experiments that verify our assumptions on the
image classification task using Vision Transformer. To validate our theoretical
findings, we perform synthetic experiments and show a linear relationship
between memorization capacity and the number of attention heads.
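As an illustration of the objects in this abstract (not the paper's construction), the following numpy sketch runs a single multi-head attention layer over a context of $n$ tokens and counts its parameters: with $H$ heads of head dimension $d$ over model dimension $D$, the per-head projections are $D \times d$ and the output projection is $Hd \times D$, so the parameter count grows linearly in the number of heads. All sizes and weights here are arbitrary stand-ins.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo):
    """X: (n, D) context of n tokens; Wq/Wk/Wv: (H, D, d) per-head maps; Wo: (H*d, D)."""
    H, D, d = Wq.shape
    heads = []
    for h in range(H):
        Q, K, V = X @ Wq[h], X @ Wk[h], X @ Wv[h]
        A = softmax(Q @ K.T / np.sqrt(d))  # (n, n) attention weights
        heads.append(A @ V)                # (n, d) per-head output
    return np.concatenate(heads, axis=-1) @ Wo  # (n, D)

rng = np.random.default_rng(0)
n, D, H, d = 8, 16, 4, 4
X = rng.normal(size=(n, D))
Wq, Wk, Wv = (rng.normal(size=(H, D, d)) for _ in range(3))
Wo = rng.normal(size=(H * d, D))
out = multi_head_attention(X, Wq, Wk, Wv, Wo)
print(out.shape)  # (8, 16): one D-dimensional output per token

# Parameter count: 3 per-head projections plus the output projection,
# i.e. 3*H*D*d + H*d*D = 1024 here -- linear in the number of heads H.
n_params = 3 * H * D * d + H * d * D
print(n_params)  # 1024
```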
Scaling Forward Gradient With Local Losses
Forward gradient learning computes a noisy directional gradient and is a
biologically plausible alternative to backprop for learning deep neural
networks. However, the standard forward gradient algorithm, when applied
naively, suffers from high variance when the number of parameters to be learned
is large. In this paper, we propose a series of architectural and algorithmic
modifications that together make forward gradient learning practical for
standard deep learning benchmark tasks. We show that it is possible to
substantially reduce the variance of the forward gradient estimator by applying
perturbations to activations rather than weights. We further improve the
scalability of forward gradient by introducing a large number of local greedy
loss functions, each of which involves only a small number of learnable
parameters, and a new MLPMixer-inspired architecture, LocalMixer, that is more
suitable for local learning. Our approach matches backprop on MNIST and
CIFAR-10 and significantly outperforms previously proposed backprop-free
algorithms on ImageNet. (Comment: 30 pages, tech report)
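The forward gradient estimator underlying this line of work can be sketched in a few lines of numpy (a toy illustration, not the paper's method): sample a random direction $v$, compute the directional derivative $\nabla f(x) \cdot v$ (here via a central finite difference in place of exact forward-mode differentiation), and use $(\nabla f(x) \cdot v)\, v$ as an unbiased but noisy gradient estimate. The toy loss below is an assumption for demonstration.

```python
import numpy as np

def forward_gradient(f, x, rng, eps=1e-5):
    v = rng.normal(size=x.shape)                         # random tangent direction
    jvp = (f(x + eps * v) - f(x - eps * v)) / (2 * eps)  # directional derivative grad f(x) . v
    return jvp * v                                       # unbiased: E[g_hat] = grad f(x)

f = lambda x: float(np.sum(x ** 2))  # toy loss with known gradient 2x
x = np.array([1.0, -2.0, 3.0])
rng = np.random.default_rng(0)

# Averaging many single-direction estimates recovers the true gradient [2, -4, 6];
# a single estimate is very noisy, which is the variance problem the paper targets.
est = np.mean([forward_gradient(f, x, rng) for _ in range(20000)], axis=0)
print(np.round(est, 1))  # close to [2., -4., 6.]
```

The variance of a single estimate grows with the number of perturbed dimensions, which is why the paper perturbs activations (few per local loss) rather than weights (many).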
EchoGNN: Explainable Ejection Fraction Estimation with Graph Neural Networks
Ejection fraction (EF) is a key indicator of cardiac function, allowing
identification of patients prone to heart dysfunctions such as heart failure.
EF is estimated from cardiac ultrasound videos known as echocardiograms (echo)
by manually tracing the left ventricle and estimating its volume on certain
frames. These estimations exhibit high inter-observer variability due to the
manual process and varying video quality. Such sources of inaccuracy and the
need for rapid assessment necessitate reliable and explainable machine learning
techniques. In this work, we introduce EchoGNN, a model based on graph neural
networks (GNNs) to estimate EF from echo videos. Our model first infers a
latent echo-graph from the frames of one or multiple echo cine series. It then
estimates weights over nodes and edges of this graph, indicating the importance
of individual frames that aid EF estimation. A GNN regressor uses this weighted
graph to predict EF. We show, qualitatively and quantitatively, that the
learned graph weights provide explainability through identification of critical
frames for EF estimation, which can be used to determine when human
intervention is required. On the public EchoNet-Dynamic EF dataset, EchoGNN
achieves EF prediction performance that is on par with the state of the art and
provides explainability, which is crucial given the high inter-observer
variability inherent in this task. (Comment: Published in MICCAI 2022)
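The core idea of weighting frames before regression can be illustrated with a hypothetical toy sketch (this is not the authors' model, which uses a learned latent graph and a GNN): score each frame embedding, normalize the scores into importance weights, and regress a scalar from the weighted readout. The frame features and weight vectors below are random stand-ins.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def weighted_frame_regressor(frames, w_score, w_out):
    """frames: (n_frames, d) embeddings of echo frames; returns (prediction, weights)."""
    weights = softmax(frames @ w_score)  # per-frame importance, sums to 1
    pooled = weights @ frames            # (d,) importance-weighted readout
    ef = float(pooled @ w_out)           # scalar EF-style prediction
    return ef, weights

rng = np.random.default_rng(0)
frames = rng.normal(size=(30, 8))        # 30 frames, 8-dim features (stand-ins)
ef, weights = weighted_frame_regressor(frames, rng.normal(size=8), rng.normal(size=8))
print(weights.argmax())  # index of the frame deemed most important
```

Inspecting the learned weights is what yields the explainability claimed in the abstract: frames with high weight are the ones driving the prediction.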